In a recent study of the thermodynamic restrictions of a theory of compressible, viscoelastic fluids, 
Fong and Simmons (ZAMP 23, No. 5 (1972)) encountered a problem of integrating the following 
matrix identity: 

$$ M\,[\,H\,U_{,C}(H^T M H)\,H^T - U_{,C}(M)\,] - H\,U_{,C}(H^T H)\,H^T = 0, $$

where $U_{,C}$ denotes the gradient of the scalar-valued function $U = U(C)$ with respect to its matrix 
argument $C$, which is symmetric and positive-definite. The identity is valid for every symmetric, 
positive-definite $M$ and every unimodular $H$. The symbol $H^T$ denotes the transpose of the matrix $H$. The solution 
of the problem is presented here in detail as an example of applying, probably for the first time, Schur's 
lemma on irreducible sets of matrices in theoretical continuum mechanics. 

1. Introduction 

Continuum mechanics, or the mechanics of a deformable medium, depends heavily on the use 
of standard results in matrix theory for the formulation of problems and their solutions. For example, 
a "hyperelastic material" is characterized by the following constitutive equation when the thermal 
variables are ignored: 

$$ T = \rho\,F\,\sigma_F(F)^T, \qquad\text{or}\qquad T^{km} = \rho\,F^{k}{}_{\alpha}\,\frac{\partial\sigma}{\partial F^{m}{}_{\alpha}}. \tag{1.1} $$

Here $T$ is the Cauchy stress tensor with a matrix representation $T^{km}$, which specifies the actual 
contact force per unit area in the spatial coordinate system $x^k$, $k = 1, 2, 3$. The symbol $\rho$ stands for 
the mass density associated with a particle at $x^k$. To define $F$, the deformation gradient 
tensor with a matrix representation $F^{k}{}_{\alpha}$, we need to introduce a reference configuration $\kappa$ with 
respect to which each material particle is given a coordinate label $X^\alpha$, $\alpha = 1, 2, 3$. The deformation 
gradient matrix is then defined as $F^{k}{}_{\alpha} = \partial x^k(X^\beta)/\partial X^\alpha$. The scalar function $\sigma(F)$ is called the 
strain-energy function of the hyperelastic material. The symbol $\sigma_F$ stands for the gradient of $\sigma$ with 
respect to $F$, and is, therefore, itself a matrix with its transpose denoted by $\sigma_F(F)^T$. Equation (1.1) 
states that the response of a hyperelastic material is completely determined for a given set of values 
of $\rho$ and $F$, provided the form of the scalar function $\sigma$ can be determined experimentally. As it stands, 
$\sigma$ depends on a 3 × 3 matrix variable, or a total of nine components of the matrix $F^{k}{}_{\alpha}$. A combination 
of a physical requirement (strain energy must be frame-indifferent) and a standard result in matrix 
theory (polar decomposition of $F$ into a product of an orthogonal $R$ and a symmetric $U$) reduces the 
number of variables in the function $\sigma$ from nine to six, i.e., $\sigma(F) = \sigma(U)$. For additional examples, 
see, e.g., Truesdell and Noll [1],¹ Murnaghan [2], etc. 

In a recent study of the thermodynamic restrictions of a theory of compressible, viscoelastic 
fluids, Fong and Simmons [3] encountered the problem of integrating the following matrix identity: 

$$ M\,[\,H\,U_{,C}(H^T M H)\,H^T - U_{,C}(M)\,] - H\,U_{,C}(H^T H)\,H^T = 0, \tag{1.2} $$

where $U_{,C}$ denotes the gradient of a scalar-valued function $U = U(C)$ with respect to its matrix 
argument $C$ which is, by definition, symmetric and positive-definite.² The identity is valid for every 
symmetric, positive-definite $M$ and every unimodular $H$, i.e., $\det H = 1$. It turns out that the 
integrability of (1.2) depends in a crucial way on two basic results in matrix theory. The purpose of 
this expository paper is to bring to the readers' attention these results, which are well known to 
mathematicians but not necessarily to workers in continuum mechanics: 

Fact 1 The set of all symmetric, positive-definite matrices is irreducible. 

Fact 2 If a matrix $Y$ commutes with each matrix of an irreducible set, then $Y$ is a scalar matrix, 
i.e., $Y = \lambda I$. 

In section 2, the notion of "reducibility" of a set of matrices is first defined. A proof relating the 
notion of "reducibility" with that of an invariant subspace is also given. In section 3, we prove 
"fact 1" with a scheme of reasoning essentially due to Newman [4]. In section 4, we begin with 
Schur's lemma on irreducible sets of matrices and use it to prove "fact 2." The integration of (1.2) 
using both facts 1 and 2 is given in section 5. Finally, a discussion of the significance of the new 
result appears in section 6. 

2. Reducibility of a Set of Matrices 

We reproduce here the formal definition of the notion of "reducibility" of a set of matrices, 
$\mathcal{A} = \{A_{(n\times n)}\}$, over the complex field, as presented by Newman [5]. The set $\mathcal{A}$ is said to be reducible 
if there exists a fixed, nonsingular matrix $S_{(n\times n)}$ and fixed positive integers $p$, $q$, such that for each 
$A$ in $\mathcal{A}$, 

$$ S^{-1} A S = \begin{pmatrix} B & O \\ C & D \end{pmatrix}. \tag{2.1} $$

The symbol $O$ represents a block of zeros with, of course, $p$ rows and $q$ columns, so that $B$ is 
$p \times p$, $D$ is $q \times q$, and $p + q = n$. If no such $S$ can be found, the set $\mathcal{A}$ is said to be 
irreducible. Examples of reducible and irreducible sets of matrices are: 

Example 2.1 The set consisting of a single n × n matrix $A$ alone, n > 1, is reducible. 

Example 2.2 The set consisting of all 2X2 matrices of the form ( y x j is reducible. 

Example 2.3 The set consisting of all n X n matrices of the form 1= =1 with respect to some 
fixed partitioning is reducible. = = 

Example 2.4 The set of all n × n column stochastic matrices, i.e., matrices having all column sums 
equal to 1, is reducible. 

Example 2.5 The set .<4—\ ( n i)' \i ill 1S irreducible. 

¹ Figures in brackets indicate the literature references at the end of this paper. 

² The scalar function $U$, as it appears in reference [3], also depends on a scalar parameter $\xi$, i.e., $U = U(C, \xi)$. For our purposes here, this dependence is 
suppressed for brevity. 

Additional examples of irreducible sets will be given in the next section. For those reducible sets 
given in the above examples, the reader can find the corresponding fixed matrix $S$ in the book by 
Newman [5]. We now wish to interpret the notion of reducibility by proving its equivalence to the 
existence of an invariant subspace: 

Remark 2.1 Let $V$ be the n-dimensional vector space over the complex field, and let $\mathcal{A} = \{A_{(n\times n)}\}$ 
be a set of matrices which is reducible. Then there exists a subspace in $V$ that is invariant 
under any sequence of transformations given by the matrices in the set $\mathcal{A}$. 
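PROOF: Interpreting matrices in $\mathcal{A}$ as transformations, we have from (2.1) 

$$ A v = S \begin{pmatrix} B & O \\ C & D \end{pmatrix} S^{-1} v, \tag{2.2} $$

valid for each $v$ in $V$ and each $A$ in the reducible set $\mathcal{A}$. Consider the set of vectors of the form 

$$ w = \begin{pmatrix} 0_{(p\times 1)} \\ y_{(q\times 1)} \end{pmatrix}. \tag{2.3} $$

These vectors form a subspace $W$ with the property that $W$ is invariant 
under matrix transformations of the type appearing in (2.1), i.e., 

$$ \begin{pmatrix} B & O \\ C & D \end{pmatrix} w = \begin{pmatrix} 0 \\ D y \end{pmatrix} = w_1 $$

is necessarily an element of $W$. We now wish to show that the image of $W$ under $S$ is invariant under the set $\mathcal{A}$. Let 
$u = S w$, i.e., $S^{-1} u = w$, and $u_1 = S w_1$. Then we have 

$$ A u = S \begin{pmatrix} B & O \\ C & D \end{pmatrix} S^{-1} u = S \begin{pmatrix} B & O \\ C & D \end{pmatrix} w = S w_1 = u_1. $$

Since the fixed nonsingular matrix $S$ maps the subspace $W$ onto a subspace $SW = \{Sw\}$ of the same 
dimension $q$, the vectors $u = Sw$ form a proper subspace of $V$ that is carried into itself by every $A$ in $\mathcal{A}$. 
We conclude that the reducibility of $\mathcal{A}$ implies the existence of an invariant subspace, namely $SW$, under $\mathcal{A}$. 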

Remark 2.2 Given a subspace $W$ of the n-dimensional vector space $V$ and given a set of matrices 
$\mathcal{A} = \{A_{(n\times n)}\}$ under which $W$ remains invariant, then the set $\mathcal{A}$ is reducible. 

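PROOF: Let the subspace $W$ be of dimension $q$, $0 < q < n$, and let $w$ be any vector in $W$. Then there 
exists a linear transformation with nonsingular matrix $S$ (for instance, one whose last $q$ columns form a 
basis of $W$) such that every vector $w$ can be brought to the form: 

$$ w = S \begin{pmatrix} 0_{(p\times 1)} \\ y_{(q\times 1)} \end{pmatrix}, \qquad p = n - q. \tag{2.4} $$

The condition that $W$ is invariant under $\mathcal{A}$ implies $\{Aw\} \subset \{w\}$. Substituting the representation 
of $w$ as given in (2.4) into the condition of invariance, we get 

$$ A S \begin{pmatrix} 0 \\ y \end{pmatrix} = S \begin{pmatrix} 0 \\ y_1 \end{pmatrix}, \qquad\text{i.e.,}\qquad S^{-1} A S \begin{pmatrix} 0 \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ y_1 \end{pmatrix}, \tag{2.5} $$

for every $y$, with $y_1$ depending on $y$. Since $y$ is arbitrary, the upper right-hand $p \times q$ block of 
$S^{-1} A S$ must vanish. The statement (2.5) therefore implies that $S^{-1} A S$ must be of the form 
$\begin{pmatrix} B & O \\ C & D \end{pmatrix}$. Hence the set $\mathcal{A}$ is reducible. 

Combining remarks 2.1 and 2.2, we arrive at the following useful result: 

Remark 2.3 A set of square matrices, $\mathcal{A} = \{A_{(n\times n)}\}$, over the complex field is reducible if, and 
only if, there exists an invariant subspace under $\mathcal{A}$. 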
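
Remark 2.3 can be illustrated with the column stochastic matrices of Example 2.4. The following is a minimal sketch, not taken from reference [5]: it assumes two arbitrarily chosen 3 × 3 column stochastic matrices and checks, in exact rational arithmetic, that the subspace of vectors whose components sum to zero is carried into itself. The helper name `matvec` and the matrices `A1`, `A2` are illustrative choices only.

```python
from fractions import Fraction as Fr

def matvec(A, v):
    """Return the product A v for a square matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# Two column stochastic 3 x 3 matrices (every column sums to 1).
# The entries are arbitrary illustrative values, not data from the paper.
A1 = [[Fr(1, 2), Fr(0), Fr(1, 3)],
      [Fr(1, 4), Fr(1), Fr(1, 3)],
      [Fr(1, 4), Fr(0), Fr(1, 3)]]
A2 = [[Fr(0), Fr(1, 5), Fr(1)],
      [Fr(1), Fr(2, 5), Fr(0)],
      [Fr(0), Fr(2, 5), Fr(0)]]

for A in (A1, A2):
    assert all(sum(A[i][j] for i in range(3)) == 1 for j in range(3))

# W = {v : v_1 + v_2 + v_3 = 0} is a proper subspace of dimension 2.
basis_W = [[Fr(1), Fr(-1), Fr(0)], [Fr(0), Fr(1), Fr(-1)]]

# Images of the basis under A1, A2 and the product A1 A2.
images = []
for w in basis_W:
    images.append(matvec(A1, w))
    images.append(matvec(A2, w))
    images.append(matvec(A1, matvec(A2, w)))

# Each image again has component sum zero, i.e., it lies in W.
stays_in_W = all(sum(v) == 0 for v in images)
print(stays_in_W)
```

The check succeeds for any column stochastic matrix, because the component sum of $Av$ equals the component sum of $v$ when every column of $A$ sums to 1; by remark 2.3 this invariant subspace is what makes the set reducible.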

3. Irreducibility of the Set of Symmetric, Positive-Definite Matrices 

We now wish to use remark 2.3 to show whether a given set of matrices is reducible or not. 
The following remark is due to Newman [4] : 

Remark 3.1 The set of matrices consisting of a diagonal matrix with nonzero and distinct 
eigenvalues, i.e., $\mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$, $\lambda_i \neq \lambda_j$ for $i \neq j$, $1 \le i, j \le n$, and a special matrix $J = (J_{ij})$, 
with $J_{ij} = 1$, $1 \le i, j \le n$, is irreducible. 

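PROOF: Let $x = (x_1, x_2, \ldots, x_n)^T$ be a nonzero vector in the n-dimensional vector space $V$. 
Let $D$ be the diagonal matrix with nonzero, distinct eigenvalues, $\lambda_1, \ldots, \lambda_n$. The proof for 
remark 3.1 can be broken into three steps as follows: 

Step 1 For $x \neq 0$, there exists a positive integer $k$ such that $J D^k x \neq 0$. Suppose the statement is 
false, i.e., $J D^i x = 0$ for $1 \le i \le n$. Since every component of $J D^i x$ equals 
$\lambda_1^i x_1 + \lambda_2^i x_2 + \cdots + \lambda_n^i x_n$, we then have the system: 

$$ \begin{aligned}
\lambda_1 x_1 + \lambda_2 x_2 + \cdots + \lambda_n x_n &= 0, \\
\lambda_1^2 x_1 + \lambda_2^2 x_2 + \cdots + \lambda_n^2 x_n &= 0, \\
&\;\;\vdots \\
\lambda_1^n x_1 + \lambda_2^n x_2 + \cdots + \lambda_n^n x_n &= 0.
\end{aligned} \tag{3.1} $$

The system (3.1) has a nontrivial solution by the hypothesis that $x \neq 0$. Hence its determinant 
must vanish, contrary to the well-known result that a determinant of the form 

$$ \Delta = \begin{vmatrix}
\lambda_1 & \lambda_2 & \cdots & \lambda_n \\
\lambda_1^2 & \lambda_2^2 & \cdots & \lambda_n^2 \\
\vdots & \vdots & & \vdots \\
\lambda_1^n & \lambda_2^n & \cdots & \lambda_n^n
\end{vmatrix}
= \lambda_1 \lambda_2 \cdots \lambda_n \prod_{1 \le i < j \le n} (\lambda_j - \lambda_i) \neq 0, \qquad (\lambda_i \neq \lambda_j \neq 0,\ i \neq j), $$

is never zero. Hence for $x \neq 0$, there exists $k$ such that 

$$ \lambda_1^k x_1 + \lambda_2^k x_2 + \cdots + \lambda_n^k x_n \neq 0. $$

Step 2 We now calculate the vector $J D^k x$ and conclude that it is equivalent to the vector 
$y = (1, 1, \ldots, 1)^T$ up to a nonzero scalar multiplying constant. Let us now calculate $Dy, D^2 y, 
\ldots, D^{n-1} y$, and obtain the following set of vectors: 

$$ \begin{aligned}
y &= (1, 1, \ldots, 1)^T, \\
D y &= (\lambda_1, \lambda_2, \ldots, \lambda_n)^T, \\
D^2 y &= (\lambda_1^2, \lambda_2^2, \ldots, \lambda_n^2)^T, \\
&\;\;\vdots \\
D^{n-1} y &= (\lambda_1^{n-1}, \lambda_2^{n-1}, \ldots, \lambda_n^{n-1})^T.
\end{aligned} $$

Since the determinant formed from these vectors is the well-known Vandermondian, which is nonzero as long as the $\lambda_i$ are 
distinct, we conclude that the above vectors form a linearly independent set of n vectors and span the space. 

Step 3 Since the set $\mathcal{A} = \{D, J\}$ of matrices when applied to a nonzero vector $x$ generates the 
entire space, there is no proper subspace invariant under $\mathcal{A}$. Hence, by remark 2.3, the set $\mathcal{A}$ is 
irreducible. 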
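
Steps 1 and 2 can be traced on a small numerical instance. The sketch below is illustrative only: the eigenvalues `lams` and the vector `x` are assumed values chosen so that the sum for k = 1 vanishes and a higher power is needed, and the helper `det` is an ordinary exact Gaussian elimination written for this example.

```python
from fractions import Fraction as Fr

def det(M):
    """Determinant by exact Gaussian elimination on a list-of-rows matrix."""
    M = [row[:] for row in M]
    n, d = len(M), Fr(1)
    for c in range(n):
        # Find a row at or below the diagonal with a nonzero entry in column c.
        p = next((r for r in range(c, n) if M[r][c] != 0), None)
        if p is None:
            return Fr(0)
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d                      # a row swap flips the sign
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return d

lams = [Fr(1), Fr(2), Fr(3), Fr(5)]   # assumed nonzero, distinct eigenvalues of D
n = len(lams)
x = [Fr(1), Fr(1), Fr(-1), Fr(0)]     # nonzero; chosen so that the k = 1 sum vanishes

# Step 1: every component of J D^k x equals sum_j lam_j^k x_j.
sums = {k: sum(l**k * xj for l, xj in zip(lams, x)) for k in range(1, n + 1)}
first_k = next(k for k in range(1, n + 1) if sums[k] != 0)

# Step 2: the vectors y, Dy, ..., D^(n-1) y with y = (1, ..., 1)^T, stored as rows.
krylov = [[l**i for l in lams] for i in range(n)]
vandermonde = det(krylov)

# Closed form: product of (lam_j - lam_i) over i < j.
product = Fr(1)
for i in range(n):
    for j in range(i + 1, n):
        product *= lams[j] - lams[i]

print(first_k, vandermonde, vandermonde == product)
```

A nonzero determinant confirms that the n vectors are linearly independent, which is the content of Step 2 for this particular choice of eigenvalues.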
Since both matrices $D$ and $J$ as defined in remark 3.1 are symmetric, it is trivial to conclude 
that, 
Remark 3.2 The set of all symmetric matrices of any order over the complex field is irreducible. 

Since any symmetric matrix can be made into a positive-definite symmetric one by the addition 
of a scalar matrix $\alpha I$, where $\alpha$ is a sufficiently large positive scalar and $I$ is the identity matrix, and it is easy to 
show that such an addition does not affect the property of reducibility of a given set of symmetric 
matrices (a subspace is invariant under $A$ if and only if it is invariant under $A + \alpha I$), we conclude that, 

Remark 3.3 The set of all symmetric, positive-definite matrices of any order over the complex 
field is irreducible. (This was stated earlier as "Fact 1.") 

4. Schur's Lemma on Irreducible Sets of Matrices 

We reproduce here the celebrated Schur's Lemma as stated in [5], p. 3: 
THEOREM (Schur's Lemma): Let $\mathcal{A} = \{A\}$, $\mathcal{B} = \{B\}$ be irreducible sets of n × n matrices, m × m 
matrices respectively. Let $M$ be a fixed m × n matrix which determines a 1-1 correspondence 
between $\mathcal{A}$ and $\mathcal{B}$ such that $MA = BM$. Then either $M = 0$, or $m = n$ and $M$ is nonsingular. 

The proof for the above theorem is given in [5] and is omitted here for brevity. It is, however, 
instructive to repeat the proof for an important corollary as follows: 

COROLLARY: If a matrix $Y$ commutes with each matrix of an irreducible set $\mathcal{A}$, then $Y$ is a scalar 
matrix, i.e., $Y = \lambda I$. 

PROOF: Let $\lambda$ be any eigenvalue of $Y$. Then $Y - \lambda I$ is singular. It is easy to see that the matrix 
$Y - \lambda I$ also commutes with each matrix of $\mathcal{A}$. Applying Schur's lemma with $\mathcal{B} = \mathcal{A}$ and $M = Y - \lambda I$, 
the nonsingular alternative is excluded, so $Y - \lambda I$ must be 0. 
Hence $Y = \lambda I$. 

This corollary was referred to earlier in the introduction as "Fact 2." 
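
The corollary can be checked on the irreducible pair $\{D, J\}$ of remark 3.1. The following sketch assumes n = 3 and $D = \mathrm{diag}(1, 2, 3)$, both illustrative choices. It sets up the linear equations $YA = AY$ for the nine unknown entries of $Y$ and counts the dimension of the solution space in exact arithmetic; the function names `rank` and `commutant_dim` are hypothetical helpers written for this example.

```python
from fractions import Fraction as Fr

def rank(rows):
    """Rank of a matrix (list of rows) by exact row reduction."""
    M = [r[:] for r in rows]
    rk, ncols = 0, len(M[0])
    for c in range(ncols):
        p = next((r for r in range(rk, len(M)) if M[r][c] != 0), None)
        if p is None:
            continue
        M[rk], M[p] = M[p], M[rk]
        for r in range(len(M)):
            if r != rk and M[r][c] != 0:
                f = M[r][c] / M[rk][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[rk])]
        rk += 1
    return rk

def commutant_dim(mats, n):
    """Dimension of {Y : YA = AY for every A in mats}, with Y an n x n matrix."""
    eqs = []
    for A in mats:
        for i in range(n):
            for k in range(n):
                # One equation (YA - AY)_ik = 0; unknown Y[a][b] has index a*n + b.
                row = [Fr(0)] * (n * n)
                for j in range(n):
                    row[i * n + j] += A[j][k]   # (YA)_ik = sum_j Y_ij A_jk
                    row[j * n + k] -= A[i][j]   # (AY)_ik = sum_j A_ij Y_jk
                eqs.append(row)
    return n * n - rank(eqs)

n = 3
D = [[Fr(i + 1) if i == j else Fr(0) for j in range(n)] for i in range(n)]
J = [[Fr(1)] * n for _ in range(n)]

dim_D_only = commutant_dim([D], n)      # every diagonal matrix commutes with D
dim_D_and_J = commutant_dim([D, J], n)  # only multiples of the identity remain
print(dim_D_only, dim_D_and_J)
```

A one-dimensional solution space means that the only matrices commuting with both $D$ and $J$ are the scalar matrices $\lambda I$, as the corollary asserts; the single matrix $D$, which is reducible by Example 2.1, leaves a larger commutant.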

5. Integration of the Matrix Identity (1.2) 
Let us rewrite (1.2) by introducing $Y = H\,U_{,C}(H^T H)\,H^T$: 

$$ H\,U_{,C}(H^T M H)\,H^T - U_{,C}(M) = M^{-1} Y. \tag{5.1} $$

Since $U$ depends on a symmetric argument, the gradient $U_{,C}$ is necessarily symmetric. This implies 
that $Y$ is symmetric, as is the left-hand side of (5.1). Since $M$ is symmetric and positive-definite, 
$M^{-1}$ is also symmetric and positive-definite. The identity (5.1) tells us that the product of two 
symmetric matrices, $M^{-1}$ and $Y$, is also symmetric. A standard result in matrix theory says that 
the necessary and sufficient condition for the product of two symmetric matrices to be again 
symmetric is that the two matrices must commute. Hence $M^{-1} Y = Y M^{-1}$. Since (5.1) is true for 
every positive-definite, symmetric $M$, and since the set of all positive-definite, symmetric matrices 
is irreducible (Fact 1), the matrix $Y$ must be a scalar matrix. This completes the application of the corollary 
to Schur's lemma as stated in the last section (Fact 2). 
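
The standard result invoked above takes one line to verify, and one further observation is needed before the corollary applies. Both are sketched here in the notation of (5.1); the symbol $N$ is introduced only for this sketch.

```latex
% M^{-1} and Y are symmetric, so
(M^{-1} Y)^{T} = Y^{T} (M^{-1})^{T} = Y M^{-1},
% and therefore
(M^{-1} Y)^{T} = M^{-1} Y \iff Y M^{-1} = M^{-1} Y .
% Y depends on H only. For fixed H, as M runs over all symmetric,
% positive-definite matrices, so does N = M^{-1}; hence
Y N = N Y \quad \text{for every symmetric, positive-definite } N ,
% which is the hypothesis of the corollary (Fact 2), the irreducible
% set being supplied by Fact 1.
```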

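With the matrix $Y$ assuming the form of a scalar matrix $\lambda(H)\,I$, we can write every term in 
(5.1), after multiplication by $dM$ and taking the trace, in the form of a total differential, i.e., 

$$ \operatorname{tr}\{H\,U_{,C}(H^T M H)\,H^T\,dM\} = d\bigl(U(H^T M H)\bigr), \tag{5.2} $$

$$ \operatorname{tr}\{U_{,C}(M)\,dM\} = d\bigl(U(M)\bigr), \tag{5.3} $$

$$ \operatorname{tr}\{\lambda(H)\,M^{-1}\,dM\} = d\bigl(\lambda(H)\ln(\det M)\bigr). \tag{5.4} $$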

For an exposition of the use of matrix notation in the calculus of differentiable functions whose 
arguments are square matrices, see, e.g., [6]. For our purposes here, it is sufficient to list the 
following formulas with which (5.2), (5.3), and (5.4) are derived: 

Given $\epsilon = \epsilon(A)$, we have $d\epsilon = \operatorname{tr}[\epsilon_{,A}^{T}\,dA]$. (5.5) 

Given $\epsilon(A) = \det A$, we have $\epsilon_{,A} = (\det A)\,(A^{-1})^{T}$. (5.6) 

Given $\epsilon(A) = \phi(BAD)$, we have $\epsilon_{,A} = B^{T}\,\phi_{,(BAD)}\,D^{T}$. (5.7) 
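
Formula (5.6) can be checked directly for a 2 × 2 matrix. The sketch below uses an assumed test matrix `A` and hypothetical helper names. Because the determinant is linear in each individual entry, the entrywise difference quotient is exact in rational arithmetic and can be compared with $(\det A)(A^{-1})^T$ without any tolerance.

```python
from fractions import Fraction as Fr

def det2(A):
    """Determinant of a 2 x 2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def inverse2(A):
    """Inverse of a nonsingular 2 x 2 matrix."""
    (a, b), (c, d) = A
    D = det2(A)
    return [[d / D, -b / D], [-c / D, a / D]]

def grad_det_formula(A):
    """Right-hand side of (5.6): (det A) times the transpose of A^{-1}."""
    inv, D = inverse2(A), det2(A)
    return [[D * inv[j][i] for j in range(2)] for i in range(2)]

def grad_det_difference(A, h=Fr(1, 1000)):
    """Entrywise difference quotient of det; exact since det is linear in each entry."""
    G = [[None, None], [None, None]]
    for i in range(2):
        for j in range(2):
            B = [row[:] for row in A]
            B[i][j] += h
            G[i][j] = (det2(B) - det2(A)) / h
    return G

A = [[Fr(2), Fr(1)], [Fr(1), Fr(3)]]   # an arbitrary symmetric, positive-definite test matrix
formula = grad_det_formula(A)
difference = grad_det_difference(A)
print(formula == difference)
```

Combined with (5.5), the same formula gives $d(\det M) = (\det M)\operatorname{tr}[M^{-1}\,dM]$ for symmetric $M$, which is the step behind (5.4).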
The identity (5.1) can now be integrated without difficulty: 

$$ U(H^T M H) - U(M) = \lambda(H)\ln(\det M) + \beta(H). \tag{5.8} $$

Detailed steps for the determination of the scalar functions $\lambda(H)$ and $\beta(H)$ are given in [3]. It is 
the main purpose of this expository article to demonstrate that without the mathematical results 
in the theory of irreducible sets of matrices, an identity on the scalar function U in the form (5.8) 
would not have been derived. 
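
The passage from (5.1) to (5.8) can be spelled out as follows. This is a sketch under the assumption that $U$ is continuously differentiable on the set of symmetric, positive-definite matrices; the closing evaluation of $\beta(H)$ at $M = I$ is an observation added here and is not the determination carried out in [3].

```latex
% With Y = \lambda(H) I, identity (5.1) reads
H\,U_{,C}(H^{T} M H)\,H^{T} - U_{,C}(M) = \lambda(H)\,M^{-1}.
% Multiply by dM, take the trace with H held fixed, and use (5.2)-(5.4):
d\bigl[\,U(H^{T} M H) - U(M) - \lambda(H)\ln(\det M)\,\bigr] = 0 .
% The set of symmetric, positive-definite M is convex, hence connected,
% so the bracket does not depend on M; call its value \beta(H):
U(H^{T} M H) - U(M) = \lambda(H)\ln(\det M) + \beta(H).
% Setting M = I, where \ln(\det I) = 0, gives one expression for \beta(H):
\beta(H) = U(H^{T} H) - U(I).
```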

6. Significance of New Result Based on Identity (5.8) 

An important contribution an applied mathematician can make in the field of science and 
engineering is to reduce the total number of variables in a given problem through a series of rigorous 
arguments, each of which can be further examined for its consistency with experiments. An 
example of this was given in the introduction of this paper where the strain energy function depends 
on six components of a symmetric matrix U instead of the nine components of the matrix F. It is 
not surprising to many mathematicians that further simplification is possible by letting the strain 
energy function depend only on the three principal invariants of the symmetric matrix $U$. The 
physical basis for the reduction of the number of variables from six to three is known as the 
condition of isotropy, where the hyperelastic material responds to an arbitrary deformation with no 
preference to its own orientation in an undistorted state.³ A rigorous characterization of an isotropic, 
hyperelastic material requires the experimental determination of a strain energy function, say, 
$W = W(L_1, L_2, L_3)$, where $L_1$, $L_2$, $L_3$ are some special combinations of the eigenvalues of the 
symmetric matrix $U$. Recently Penn [7] reported the results of a series of experiments on the 
deformation of a peroxide vulcanized, pure-gum, natural rubber. He concluded from his experiments 
that the strain energy function, in general, cannot be separated as a sum of two parts: 

$$ W = W(L_1, L_2, L_3) \neq F(L_1, L_2) + G(L_3). \tag{6.1} $$

In attempting to explain this experimental result, Fong and Simmons [3] studied the 
thermodynamic restrictions of a theory due to Bernstein, Kearsley, and Zapas [8, 9]. The theory was 
motivated by that of hyperelastic materials by replacing, among other things, the strain energy 
function $W$ with a more general, time-dependent function, $U = U(C(t, \tau), t - \tau)$, where $t$ and $\tau$ 
denote, respectively, the present and some past time between $-\infty$ and $t$. An identity as given in 
(1.2) on the gradient of the function $U$ was derived, and Schur's lemma was applied in arriving at 
the identity (5.8) on $U$. The significance of (5.8), as discussed in [3] and [10], is best described in 
terms of a decomposition result based on (5.8): 

$$ U(L_1, L_2, L_3, \xi) = F(L_1, L_2, \xi) + G(L_3, \xi) + H(L_1, L_2, \xi)\ln L_3. \tag{6.2} $$



³ For a thorough treatment of the notion of isotropy, see, e.g., Truesdell & Noll [1]. 

Since the Bernstein-Kearsley-Zapas theory is known to describe responses of hyperelastic 
materials for some special forms of $U$, it is conceivable that Penn's data [7] can be explained with 
an analogous decomposition on the strain energy function $W$: 

$$ W = W(L_1, L_2, L_3) = F(L_1, L_2) + G(L_3) + H(L_1, L_2)\ln L_3. \tag{6.3} $$

Further significance of the reduction of the form of W as stated in (6.3) will appear in a forthcoming 
paper [10]. 



I thank E. A. Kearsley, M. Newman, H. J. Oser, R. W. Penn, and L. J. Zapas, for their generous 
help and critical comments in the preparation of this exposition. 